Text Recognition and Translation of Multi-Oriented, Multi-Language and Curved Text in Natural Scene Images

Author

  • Cristina Vassallo
Abstract

This study addresses text detection and recognition in natural scene images. It focuses on the detection, recognition and, eventually, translation of multi-oriented, multi-language and curvilinear text in such images. The study attempts to provide a solution that can detect and recognise such text, since leading mobile applications such as Word Lens and Google Goggles do not support it for translation. Many algorithms can detect and recognise text, but very few consider text that is multi-oriented, multi-script or curvilinear. Text detection can be carried out using various methods, including region-based and texture-based methods. Algorithms for multi-oriented text detection are further divided into non-headline-based and headline-based methods. Three solutions were considered in this study. The first was an algorithm developed specifically for maps, in which text usually has various orientations, curvatures and sizes. The second was a framework that performs detection and recognition simultaneously. The third was a combination of two algorithms, one for detection and one for aligning curved and multi-oriented text. The research consisted of implementing the third option and integrating the two algorithms into a system that could detect and recognise multi-oriented, multi-script and curvilinear text.

The proposed system was tested on several publicly available data sets intended for evaluating such systems. Results were recorded in terms of time, precision and recall. In addition, the proposed system was compared with two leading applications: Word Lens and Google Goggles. The resulting system was not capable of detecting and recognising all kinds of images and text. It did, however, offer a solution to issues that other leading applications face, such as the detection of multi-oriented and multi-script text. Results can be affected by various factors, including the libraries and languages used to implement an algorithm. The device used and its processing power strongly affect performance and probably also the precision of a system. Other factors to consider when implementing such a system are the data sets used for testing and the implementer's experience and knowledge of the area.
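The abstract reports results in terms of precision and recall but does not say how detections were matched to ground truth. The following is a minimal sketch of one common way to compute both measures for text detection. It assumes axis-aligned bounding boxes and an intersection-over-union (IoU) threshold of 0.5; both are illustrative assumptions rather than the study's protocol, and the function names are hypothetical. Multi-oriented or curved text would normally need rotated boxes or polygons instead.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), axis-aligned.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    detected: List[Box], truth: List[Box], threshold: float = 0.5
) -> Tuple[float, float]:
    """Greedily match each detection to its best unmatched ground-truth box.

    A detection counts as a true positive when its best IoU reaches the
    threshold (0.5 here is an assumed value, not taken from the study).
    """
    unmatched = list(truth)
    true_positives = 0
    for det in detected:
        best = max(unmatched, key=lambda gt: iou(det, gt), default=None)
        if best is not None and iou(det, best) >= threshold:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Made-up boxes for illustration only.
detected_boxes = [(0, 0, 10, 10), (50, 50, 60, 60)]
truth_boxes = [(1, 1, 10, 10), (100, 100, 110, 110), (200, 200, 210, 210)]
precision, recall = precision_recall(detected_boxes, truth_boxes)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is the fraction of detections that match a ground-truth region, and recall is the fraction of ground-truth regions that were found. A system that misses many text regions, as the abstract describes, would score low on recall even when its precision is high.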

Similar resources

A new model for Persian multi-part words edition based on statistical machine translation

Multi-part words in English language are hyphenated and hyphen is used to separate different parts. Persian language consists of multi-part words as well. Based on Persian morphology, half-space character is needed to separate parts of multi-part words where in many cases people incorrectly use space character instead of half-space character. This common incorrectly use of space leads to some s...

Natural scene text localization using edge color signature

Localizing text regions in images taken from natural scenes is one of the challenging problems due to variations in font, size, color and orientation of text. In this paper, we introduce a new concept so-called Edge Color Signature for localizing text regions in an image. This method is able to localize both Farsi and English texts. In the proposed method first a pyramid using diff...

Scene Text Detection via Holistic, Multi-Channel Prediction

Recently, scene text detection has become an active research topic in computer vision and document analysis, because of its great importance and significant challenge. However, vast majority of the existing methods detect text within local regions, typically through extracting character, word or line level candidates followed by candidate aggregation and false positive elimination, which potent...

E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text

An end-to-end method for multi-language scene text localization, recognition and script identification is proposed. The approach is based on a set of convolutional neural nets. The method, called E2E-MLT, achieves state-of-the-art performance for both joint localization and script identification in natural images and in cropped word script identification. E2E-MLT is the first published multi-lan...

A Dataset and Evaluation Metric for Coherent Text Recognition from Scene Images

In this paper, we deal with extraction of textual information from scene images. So far, the task of Scene Text Recognition (STR) has only been focusing on recognition of isolated words and, for simplicity, it omits words which are too short. Such an approach is not suitable for further processing of the extracted text. We define a new task which aims at extracting coherent blocks of text from ...

Publication date: 2015